Non-invasive fractional flow reserve estimation using deep learning on intermediate left anterior descending coronary artery lesion angiography images

This study aimed to design an end-to-end deep learning model for estimating the value of fractional flow reserve (FFR) using angiography images to classify left anterior descending (LAD) branch angiography images with average stenosis between 50 and 70% into two categories: FFR > 80 and FFR ≤ 80. In this study 3625 images were extracted from 41 patients’ angiography films. Nine pre-trained convolutional neural networks (CNN), including DenseNet121, InceptionResNetV2, VGG16, VGG19, ResNet50V2, Xception, MobileNetV3Large, DenseNet201, and DenseNet169, were used to extract the features of images. DenseNet169 indicated higher performance compared to other networks. AUC, Accuracy, Sensitivity, Specificity, Precision, and F1-score of the proposed DenseNet169 network were 0.81, 0.81, 0.86, 0.75, 0.82, and 0.84, respectively. The deep learning-based method proposed in this study can non-invasively and consistently estimate FFR from angiographic images, offering significant clinical potential for diagnosing and treating coronary artery disease by combining anatomical and physiological parameters.


Population
This retrospective cross-sectional study was conducted in 2023. The angiographic images of 41 patients who underwent angiography and FFR on the left anterior descending (LAD) coronary artery and were referred to a cardiac center between 2015 and 2022 were used in this study. Patients were referred for angiography based on symptoms such as chest pain or shortness of breath, as well as risk factors such as family history, smoking, and high cholesterol, suggesting a preliminary diagnosis of coronary artery disease. Angiography was requested for further evaluation based on clinical presentation and noninvasive testing such as stress testing. The study participants' ages ranged from 42 to 57 years, and 19 participants were female. The participants had no stenosis, coronary flow impairment, acute myocardial infarction, or history of open-heart surgery. FFR was performed to physiologically evaluate lesions with a visually estimated stenosis of between 50% and 70%. The data were collected by reviewing the medical records and the angiography department's archive. All patients underwent coronary angiography through the femoral artery using a Judkins catheter and conventional imaging. Multiple physicians performed angiography in all cases, and Ultravist-370 (Schering, Berlin, Germany) was used as the contrast agent. The injection was done manually (6-8 ml of contrast agent per injection). Coronary pressure was measured using a 0.014-inch pressure wire (St. Jude Medical, USA). The wire was guided and calibrated using a guiding catheter and placed approximately three centimeters past the stenosis. Maximum hyperemia was induced by intravenous administration of adenosine (average dose of 120 µg).
All experimental protocols were approved by the Institutional Review Board of Shahid Beheshti University of Medical Sciences, with the approval code IR.SBMU.RETECH.REC.1401.665, and were performed in accordance with relevant guidelines and regulations. Informed consent was obtained from all subjects and/or their legal guardians.

Data structure
The training data used in this study consisted of 2390 images from 18 patients before and after revascularization (all of these patients underwent the FFR procedure after revascularization surgery, and their FFR values were greater than 80). Given that the arterial structure of a patient before and after revascularization surgery is the same, and the only difference is the removal of stenosis and the increase in flow at the site of the lesion, the angiography images of these patients before stenting were classified into the category of patients with FFR ≤ 80, and the images after revascularization surgery were classified into the category of patients with FFR > 80. Therefore, assuming that the proposed model is sensitive to these changes and learns the desired region of interest better, this category of images was selected as the training dataset. Additionally, for model evaluation, the test dataset consisted of 772 images from twenty-three patients, including 14 patients with FFR > 80 and nine patients with FFR ≤ 80, as described in Table 2. The before-and-after images of patients were not used in the test dataset, and the images in each category in this dataset only included unique images of unique patients to allow a fair and unbiased evaluation of the model. Figure 1 shows a patient's FFR value before and after revascularization surgery and changes in the region of interest (ROI), indicated with a red circle in the image.

Data preparation
An interventional cardiologist evaluated the angiography films of patients, and a total of 3625 black and white images related to the LAD artery from forty-one patients were included in the study, each measuring 512 × 512 pixels. This study classified patients into FFRH class for FFR >

Proposed method
Figure 2 illustrates the structure of the proposed method. First, pre-processing was performed on the input images, including decoding, resizing, normalization, augmentation, and histogram equalization. Then, the feature extractor passed the obtained feature vector to the classifier block, and finally, the images were divided into two classes: FFR > 80 and FFR ≤ 80.

Resizing. The image size of 380 × 380 pixels was selected using grid search.
Data normalization. Normalization was applied to all images before entering the network. The data were normalized to reduce the effect of intensity variations between radiographs. Normalization involves scaling the pixel values of images to a standard range, or to zero mean and unit variance, to reduce the impact of varying lighting conditions on the image. Scaling involves rescaling the data to have similar units so that no feature dominates another 67.
For data normalization, first, the pixel-level global mean and standard deviation (SD) were calculated for all the images; next, the data were normalized using Eq. 1:
$$Z_i = \frac{X_i - \mu}{\sigma + \varepsilon} \tag{1}$$
where μ is the global mean of the image set X, σ is the SD, ε = 1e−10 is a small constant that prevents the denominator from becoming zero, i ∈ [1, 2083] is the index of each training sample, and Z_i is the normalized version of X_i (41).
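The normalization step described above can be sketched in a few lines of NumPy. This is a minimal illustration, assuming the usual standardization form in which ε is added to the SD in the denominator; the function name `global_standardize` and the random stand-in images are hypothetical and not taken from the study.

```python
import numpy as np

def global_standardize(images, eps=1e-10):
    """Standardize a stack of images with one global mean and SD."""
    images = np.asarray(images, dtype=np.float64)
    mu = images.mean()      # pixel-level global mean over the whole set
    sigma = images.std()    # pixel-level global standard deviation
    # eps keeps the denominator non-zero even for a constant image set
    return (images - mu) / (sigma + eps), mu, sigma

# Hypothetical stand-in data: 4 random grayscale "images" of 8 x 8 pixels.
rng = np.random.default_rng(0)
demo_images = rng.integers(0, 256, size=(4, 8, 8))
normalized, mu, sigma = global_standardize(demo_images)
```

In practice the mean and SD would be computed on the training images only and then reused unchanged for validation and test images, so that no test-set statistics leak into training.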
Augmentation. Data augmentation is essential in deep learning models. It involves generalizing the training samples by transforming images without losing their semantic and intrinsic information. These transformations were randomly applied to the data 68,69.
Data augmentation involves creating more training examples by transforming existing images through rotation, translation, contrast change, and zooming techniques.
Table 3 shows data augmentation techniques and the parameters used in this study.
Histogram equalization. The histogram information was used, and the most common intensity values were dispersed to produce a contrast-improved image 70. Histogram equalization was performed using Eq. 2, where L is the maximum intensity level of the image; M is the width of the image; N is the height of the image; n_j is the frequency corresponding to each intensity level; r_j is the range of values from 0 to L − 1; P_in is the total frequency that corresponds to a specific value of r_j; R_k denotes the new frequencies; and S_k is the new equalized histogram, with k = 0, 1, 2, ..., L − 1 13.
This study used this technique to adjust the contrast of the input image. Figure 3 shows an example of using this technique.
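As an illustration of the idea, the sketch below implements the textbook form of global histogram equalization, in which each intensity level k is mapped to (L − 1) times the cumulative relative frequency of levels 0 to k. It is an assumption that this matches the study's Eq. 2 exactly; the function name and the small demo image are hypothetical.

```python
import numpy as np

def equalize_histogram(image, levels=256):
    """Equalize an integer grayscale image with values in [0, levels - 1].

    Assumes levels <= 256 so that the output fits in uint8.
    """
    image = np.asarray(image)
    counts = np.bincount(image.ravel(), minlength=levels)    # n_j per intensity level
    probs = counts / image.size                               # n_j / (M * N)
    cdf = np.cumsum(probs)                                    # cumulative sum up to level k
    mapping = np.round((levels - 1) * cdf).astype(np.uint8)   # new level for every k
    return mapping[image]                                     # look up each pixel

# Hypothetical low-contrast 4 x 4 image that only uses intensities 100..103.
demo = np.array([[100, 100, 101, 101],
                 [100, 101, 102, 102],
                 [101, 102, 103, 103],
                 [102, 103, 103, 103]], dtype=np.uint8)
equalized = equalize_histogram(demo)
```

The four input levels, originally one gray value apart, are spread across most of the 0 to 255 range, which is the contrast improvement the technique is used for.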

Model architecture
The proposed model consisted of feature extraction and classification blocks, explained in the following.
Feature extractor. Nine well-known pre-trained CNNs were used for image feature extraction, including DenseNet121 71, InceptionResNetV2 72, VGG16 73, VGG19 73, ResNet50V2 74, Xception 75, DenseNet201 71, DenseNet169 71, and MobileNetV3Large 76. After running these networks on the dataset and evaluating them, DenseNet169 showed the best performance. This architecture consists of a convolutional layer, a pooling layer, four dense blocks, and three transition layers. The four dense blocks and three transition layers are delineated separately using distinct boxes to showcase the individual components. For each dense block, the number of constituent layers is also indicated. For instance, Dense Block 1 is composed of 6 layers, with each layer utilizing batch normalization (BN) and ReLU activation, followed by 1 × 1 and 3 × 3 convolutional filters of size 64. The subsequent Dense Blocks 2, 3 and 4 progressively increase the number of layers while maintaining an identical structure of batch normalization, ReLU activation, and convolutional filtering. Finally, the transition layers between the dense blocks employ batch normalization, ReLU activation, and 1 × 1 convolutions with 128, 256, and 512 filters, respectively. Figure 4 illustrates the overall architecture of this network 71.
Table 3. Details on the data augmentation techniques and parameters (columns: Type, Parameters; first listed type: random rotation).
Classifier. For classifying angiography images into two classes of FFR > 80 and FFR ≤ 80, a classifier block was designed, as shown in Fig. 5, in which two fully connected sequential blocks were used after the batch-normalization layer.
The first block consisted of dense, ReLU, kernel regularizer L1L2, batch-normalization, and dropout layers. The second block comprised dense, ReLU, and batch-normalization layers, respectively. Figure 6 displays these steps in detail.
The classifier was a dense layer with two neurons, and the Softmax function was applied to these representations. This function specified the probability of allocating each sample to one of the two classes, and its value fell in the [0, 1] range. Figure 5 displays these steps in detail.
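The final Softmax step can be illustrated with a small NumPy sketch. The logits below are hypothetical outputs of the two-neuron dense layer, and which index corresponds to FFR > 80 or FFR ≤ 80 is an assumption left open here.

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over the last axis."""
    logits = np.asarray(logits, dtype=np.float64)
    shifted = logits - logits.max(axis=-1, keepdims=True)  # avoids overflow in exp
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

# Hypothetical two-neuron outputs for three images.
demo_logits = np.array([[2.0, 0.5], [0.0, 0.0], [-1.0, 3.0]])
probs = softmax(demo_logits)
predicted_class = probs.argmax(axis=-1)  # index of the more probable class
```

Each row of `probs` sums to 1, so the two values can be read as the probabilities of the two classes for that image.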

Training and implementation
In the first training phase, the feature extractor block was completely frozen using the transfer learning approach, so all of its parameters were non-trainable. This model was trained for several epochs with weights obtained after fitting the ImageNet dataset. However, all parameters of the classifier block were trainable.
The first training phase used the Adam optimizer with an initial learning rate of 1e-2 and a decay rate of 1e-5. The Adam optimizer with an initial learning rate of 1e-4 and a decay rate of 1e-6 was used in the second training phase. In both training phases, cosine annealing was used. In the second (fine-tuning) phase, all network layers were trainable except for the first eight layers of the feature extractor (the first convolutional block), which remained frozen.
The training process consisted of 120 epochs in the first phase and 600 in the second phase. Early stopping was applied with a patience of ten epochs in the first phase and 100 epochs in the second. In the second phase of training, validation loss was also monitored. If it did not improve for ten epochs, the learning rate was reduced by 20%. Validation accuracy was also monitored, and only the model with the best weights obtained was saved. The optimal hyperparameter values were obtained using grid search. The values of the kernel regularizer parameters were l1 = 1e-5 and l2 = 1e-4. These architectures were implemented using the Python language and the Keras library and executed on Google's TPU v3-8. Figure 7 shows the training and validation loss after 238 iterations during the training process.
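The interaction between the plateau-based learning rate reduction and early stopping can be sketched in plain Python. This is only an illustration of the logic under stated assumptions: the function name, the loss history, and the shortened stopping patience are hypothetical, and the actual Keras callbacks used in the study may differ in details such as minimum improvement thresholds.

```python
def simulate_training(val_losses, lr=1e-4, factor=0.8, lr_patience=10, stop_patience=100):
    """Replay a validation-loss history; return (stop epoch, final LR, best loss)."""
    best = float("inf")
    since_best = 0        # epochs since the last improvement (early stopping)
    since_lr_change = 0   # epochs without improvement since the last LR cut
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best, since_best, since_lr_change = loss, 0, 0
        else:
            since_best += 1
            since_lr_change += 1
            if since_lr_change >= lr_patience:
                lr *= factor            # reduce the learning rate by 20%
                since_lr_change = 0
            if since_best >= stop_patience:
                return epoch, lr, best  # early stopping triggered
    return len(val_losses), lr, best

# Hypothetical history: loss improves for 5 epochs, then stays flat for 25.
history = [1.0, 0.9, 0.8, 0.7, 0.6] + [0.6] * 25
stopped_at, final_lr, best_loss = simulate_training(history, stop_patience=20)
```

With this made-up history, the learning rate is cut twice (after 10 and 20 stagnant epochs) before training stops at epoch 25.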
Loss function. Cross-entropy was used as the loss function. It is a metric for measuring the performance of a classification model in machine learning and is defined by Eq. 3, where P(x) is the probability of the event x in P, Q(x) is the probability of the event x in Q, and log is the base-2 logarithm 77.
$$H(P, Q) = -\sum_{x \in X} P(x) \log_2 Q(x) \tag{3}$$
Learning rate schedule. The learning rate schedule is a pre-defined framework that adjusts the learning rate between epochs or iterations to avoid getting stuck in a local optimum as training progresses. This study used cosine annealing with warm restarts as the learning rate schedule, taking the best weights as the restart points (Eq. 4).
Within the i-th run, the learning rate is decayed with cosine annealing for each batch as follows:
$$\eta_t = \eta_{\min}^{i} + \frac{1}{2}\left(\eta_{\max}^{i} - \eta_{\min}^{i}\right)\left(1 + \cos\left(\frac{T_{cur}}{T_i}\pi\right)\right) \tag{4}$$
where η_min^i and η_max^i are the bounds of the learning rate, T_i is the length of the i-th run in epochs, and T_cur accounts for how many epochs have been performed since the last restart. Since T_cur is updated at each batch iteration t, it can take discrete fractional values such as 0.1 and 0.2. Thus, η_t = η_max^i when t = 0 and T_cur = 0. Once T_cur = T_i, the cos function outputs −1, so η_t = η_min^i 78.
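The cosine annealing rule can be written directly as a short Python function. The sketch below restarts after a fixed number of epochs, which is a simplifying assumption: the study restarts from the best weights, and the minimum rate and run length used here are made-up illustrative values (only the 1e-4 maximum comes from the second training phase).

```python
import math

def cosine_annealing_lr(t_cur, t_i, eta_min, eta_max):
    """Learning rate after t_cur epochs (possibly fractional) of a run of length t_i."""
    return eta_min + 0.5 * (eta_max - eta_min) * (1.0 + math.cos(math.pi * t_cur / t_i))

def schedule(total_epochs, t_i, eta_min, eta_max):
    """Per-epoch learning rates with a warm restart every t_i epochs."""
    return [cosine_annealing_lr(epoch % t_i, t_i, eta_min, eta_max)
            for epoch in range(total_epochs)]

# Hypothetical settings: eta_min and the 10-epoch run length are illustrative only.
rates = schedule(total_epochs=20, t_i=10, eta_min=1e-6, eta_max=1e-4)
```

Within each run the rate falls smoothly from the maximum toward the minimum, then jumps back to the maximum at the restart (epoch 10 in this example).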
Custom weighting. An unequal number of class samples, known as class imbalance, is an issue in machine learning classification problems. It affects the prediction model and leads to bias. Custom weighting was used to address this challenge, with a weight of 0.8 for the high-count class and 1.32 for the low-count class. These values represent the weighted average of the number of samples in each class.
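One common way to derive such weights is the inverse-frequency heuristic, in which each class weight is the total sample count divided by the number of classes times that class's count. Whether the study used exactly this rule is an assumption, and the class counts below are hypothetical, since the per-class image counts are not stated in this section.

```python
def balanced_class_weights(class_counts):
    """Inverse-frequency weights: total / (number_of_classes * count)."""
    total = sum(class_counts.values())
    n_classes = len(class_counts)
    return {label: total / (n_classes * count)
            for label, count in class_counts.items()}

# Hypothetical counts, chosen only to show the effect of imbalance.
demo_counts = {"FFR > 80": 600, "FFR <= 80": 400}
weights = balanced_class_weights(demo_counts)
```

The larger class receives a weight below 1 and the smaller class a weight above 1, so that both classes contribute equally to the total weighted loss.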
Label smoothing. Label smoothing was used to improve the generalizability of the model. Label smoothing is an effective regularization tool for deep neural networks (DNNs) and can implicitly calibrate the model's predictions. It significantly impacts the model interpretability and improves model calibration and beam search. It accounts for possible mistakes in the dataset labels, in which case maximizing the likelihood log p(y | x) can be directly harmful. For a small constant ε, the training set label y is assumed to be correct with probability 1 − ε and incorrect otherwise. Label smoothing regularizes a model based on a Softmax with k output values by replacing the hard 0 and 1 classification targets with targets of ε/(k − 1) and 1 − ε, respectively 76,79-81.
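A minimal NumPy sketch of label smoothing combined with the base-2 cross-entropy of Eq. 3 is given below. The value ε = 0.1 and the example predictions are illustrative assumptions, not the study's settings; note also that some libraries implement a slightly different smoothing convention.

```python
import numpy as np

def smooth_labels(one_hot, eps):
    """Replace hard 0/1 targets with eps / (k - 1) and 1 - eps."""
    one_hot = np.asarray(one_hot, dtype=np.float64)
    k = one_hot.shape[-1]  # number of classes
    return one_hot * (1.0 - eps) + (1.0 - one_hot) * eps / (k - 1)

def cross_entropy(targets, predictions, tiny=1e-12):
    """Batch mean of -sum_x P(x) * log2(Q(x))."""
    predictions = np.clip(predictions, tiny, 1.0)  # guards against log(0)
    return float(np.mean(-np.sum(targets * np.log2(predictions), axis=-1)))

# Hypothetical two-class batch; eps = 0.1 is an illustrative value.
hard = np.array([[1.0, 0.0], [0.0, 1.0]])
soft = smooth_labels(hard, eps=0.1)
preds = np.array([[0.5, 0.5], [0.5, 0.5]])
loss = cross_entropy(soft, preds)
```

For two classes the smoothed targets become 0.9 and 0.1, and a maximally uncertain prediction of 0.5 for each class gives a loss of exactly 1 bit.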
Techniques to prevent overfitting. Overfitting is a fundamental problem in supervised machine learning, preventing models from perfectly generalizing to observed training data and unseen test set data. Overfitting occurs due to noise, limited training set size, and classifier complexity 82. To address concerns related to potential overfitting in our model, several regularization techniques were incorporated during the model development phase. Batch normalization was applied to normalize the activations of various layers, enhancing the stability of the learning process. Additionally, dropout with a rate of 0.2 was implemented on specific layers to introduce a level of randomness, preventing the model from relying too heavily on specific features present in the training set. Furthermore, an L1L2 kernel regularizer was employed on the dense layer with carefully chosen coefficients to penalize large weights and reduce model complexity. These regularization techniques collectively contribute to the robustness of our model by striking a balance between fitting the training data and generalizing well to new, unseen data. The effectiveness of these measures is evident in the model's performance, as illustrated in Fig. 7 and discussed in the results section.
Mixed precision. Mixed precision decreased fitting/training time and reduced memory usage during training. Figure 6 illustrates the mechanism of this method.

Ethical approval
All experimental protocols were approved by the Institutional Review Board of Shahid Beheshti University of Medical Sciences, with the approval code IR.SBMU.RETECH.REC.1401.665, and informed consent was obtained from all subjects and/or their legal guardians.

Experiments
In this section, the performance evaluation parameters of the model are first explained, then the proposed method's performance is evaluated, and the model training results are reported. Furthermore, various well-known pre-trained networks were also used, and their training results were compared with the proposed method.

Evaluation metrics
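Evaluation metrics are measures used to evaluate the performance of a deep learning model. The main ones are Accuracy, Precision, Recall (Sensitivity), F-Measure, and Specificity. The numbers of true-positive (TP), false-positive (FP), true-negative (TN), and false-negative (FN) values are required to compute these parameters, as given below.
$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$
$$\text{Precision} = \frac{TP}{TP + FP}$$
$$\text{Recall} = \frac{TP}{TP + FN}$$
$$\text{F-Measure} = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}$$
$$\text{Specificity} = \frac{TN}{TN + FP}$$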
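These metrics follow directly from the four confusion-matrix counts, as the short sketch below shows. The counts are hypothetical and are not the study's confusion matrix; the last line only checks that the reported precision (0.82) and sensitivity (0.86) are consistent with the reported F1-score.

```python
def classification_metrics(tp, fp, tn, fn):
    """Standard binary classification metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # recall, true-positive rate
    specificity = tn / (tn + fp)   # true-negative rate
    precision = tp / (tp + fp)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {"accuracy": accuracy, "sensitivity": sensitivity,
            "specificity": specificity, "precision": precision, "f1": f1}

# Hypothetical counts for illustration only.
demo_metrics = classification_metrics(tp=80, fp=20, tn=60, fn=20)

# Harmonic mean of the reported precision and sensitivity.
reported_f1 = 2 * 0.82 * 0.86 / (0.82 + 0.86)
```

Rounded to two decimals, `reported_f1` equals 0.84, which agrees with the F1-score stated for the DenseNet169 model.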

Model evaluation
In this section, the evaluation results of the model on the test dataset are reported. For evaluating the proposed model, the cross-validation method was used. Cross-validation is a statistical method for evaluating and comparing learning algorithms by dividing the data into training and validation segments 84-86. The main form of cross-validation is k-fold cross-validation, where k equals the number of folds. This type of validation is performed as follows: in each iteration, one or more learning algorithms use k − 1 folds of data to learn one or more models, and subsequently, the learned models are asked to make predictions about the data in the validation fold. The performance of each learning algorithm on each fold can be tracked using some predetermined performance metric such as accuracy. Different methodologies, such as averaging, can be used to obtain an aggregate measure from these samples, or these samples can be used in a statistical hypothesis test to show that one algorithm is superior to another.
This study used five-fold cross-validation to validate the proposed model.The final results of evaluating the proposed model using this method are reported in Table 4 and Fig. 8.
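The fold construction behind k-fold cross-validation can be sketched as follows. The function name and the choice of 23 items are hypothetical; how the study assigned images or patients to folds is not stated, and splitting by patient rather than by image is a common precaution against near-duplicate frames appearing in both training and validation folds.

```python
import numpy as np

def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_idx, val_idx) index arrays for k-fold cross-validation."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n_samples), k)  # k nearly equal folds
    for i in range(k):
        val_idx = folds[i]                                  # held-out fold
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train_idx, val_idx

# Hypothetical run on 23 items (for example, one index per patient).
splits = list(k_fold_indices(23, k=5))
```

Each item appears in exactly one validation fold, and the per-fold metric values can then be averaged to obtain the aggregate result.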
The Receiver Operating Characteristic (ROC) curve in Fig. 9 illustrates the model's performance for Fractional Flow Reserve (FFR) classification, with an Area Under the Curve (AUC) of 0.81. This AUC value indicates good discriminatory capacity between the FFR > 80 and FFR ≤ 80 classes. The 95% confidence interval for the AUC was [0.777, 0.833]. Moreover, the low p-value (< 0.001) indicates a statistically significant difference from the chance-level baseline value of 0.5.

Review and comparison of pre-trained feature extractors
Nine pre-trained CNNs, including DenseNet121, InceptionResNetV2, VGG16, VGG19, ResNet50V2, Xception, MobileNetV3Large, DenseNet201, and DenseNet169 (Proposed), were used for image feature extraction and were evaluated with the test dataset. These models were compared based on the accuracy parameter. Table 5 shows the obtained results.
The performance outcomes of the three highest-accuracy models on the test data are presented in Table 6.

Discussion
In the present study, a fast, end-to-end, automated deep learning model was designed for estimating FFR values using angiography images. This model can classify angiography images into two classes, FFR > 80 and FFR ≤ 80, with no manual annotation and an overall accuracy of 81%. Multiple studies have shown a correlation between anatomical and physiological parameters 87,88, and the current study's findings also provide further insights into how angiography features affect FFR values.
Although angiography is the gold standard for evaluating the severity of coronary lesions, physiological evaluation is the determining factor for treatment planning in patients with coronary artery disease 89. FFR is considered the gold standard for the physiological assessment of coronary artery stenosis and is a strong indicator for diagnosis, treatment, and determining the approach for interventions. However, the invasive nature of FFR evaluation and its high cost have led to a lack of enthusiasm among healthcare professionals to use this method routinely in the Cath lab. The proposed method in this study has the potential to be used routinely in Cath labs due to its low cost, no need for additional data entry or extra workload for the cardiologist, online usability, and no need for changes in workflow in the Cath lab. However, this method requires external validation. External evaluation in deep learning checks a model's performance on new, distinct data, ensuring its generalization and minimizing overfitting for real-world applications 90-92.
The present study shows that in recent years, significant efforts have been made to integrate anatomical and physiological parameters, indicating this method's clinical value for physicians and patients. However, integrating anatomical and physiological parameters is a significant challenge 93. Various methods have been developed to calculate FFR without an invasive pressure wire or inducing hyperemia 31. The present study's findings also demonstrate that image-based deep learning for determining FFR is a non-invasive and cost-effective method that can be used to match the visual and physiological features of coronary artery stenosis.
In recent years, an end-to-end framework has been introduced in deep learning, and its benefits in the health field have been investigated 94,95. This study's proposed model demonstrates the advantages of using this approach for estimating FFR. Physicians can use this model to evaluate physiological conditions without entering additional data or manual annotation, only by inputting angiography images. Additionally, to facilitate the successful implementation of this method in Cath labs, systems based on this model can display FFR values online. On the other hand, the FAME study shows that only 35% of patients with stenosis between 50 and 70% are found to have significant stenosis on FFR evaluation. In other words, a model that can detect more insignificant stenoses will result in fewer unnecessary FFRs.
The existence of a non-invasive method for reducing unnecessary FFRs is also very important, and artificial intelligence, due to its non-invasiveness and the lack of need to change the workflow of the cardiac catheterization laboratory, can be an excellent solution. This highlights the potential value of an accurate non-invasive AI-based FFR estimation approach. Such a method could help avoid unnecessary invasive FFR procedures and their associated costs and complications in cases where non-invasive assessment predicts non-significant stenosis. This is particularly relevant given that studies show only a subset of intermediate coronary lesions are found to be hemodynamically significant when measured invasively. More widespread adoption of validated non-invasive FFR estimation techniques may improve clinical workflows and benefit both patients and healthcare systems.
In the present study, the DenseNet169 model outperformed other models in the detection of insignificant stenosis. Compared to other studies in this field, our proposed method requires only a single view from the angiography image with no need for annotation or additional parameters, without altering existing clinical workflows, yet still achieves state-of-the-art performance by utilizing a deep learning approach.

Study limitations and future considerations
While our study provides valuable insights into FFR estimation using angiography images, it is essential to acknowledge certain limitations. Firstly, the relatively small sample size of 41 patients might impact the generalizability of our findings. Future research should prioritize the inclusion of a larger and more diverse cohort to enhance the robustness and external validity of the proposed model. Additionally, this study focused solely on the parameters present in angiography images, omitting potentially influential factors such as age and gender. The exclusion of these variables may limit the comprehensive understanding of FFR estimation. Future investigations could explore the incorporation of additional clinical parameters to refine and expand the predictive capabilities of the model. External evaluation of our method on independent datasets will also be important to further validate the generalizability of our findings and will be a focus of our future work.

Conclusion
This study designed an intelligent, fast, end-to-end, and automated method using the CNN architecture, the concept of transfer learning, and the pre-trained DenseNet169 network for estimating FFR values based on angiography images. This model can estimate FFR non-invasively with an overall accuracy of 81%. DL-based angiography image-derived FFR is a valuable tool for decision-making in diagnosing and treating stenosis in Cath labs. This model can assist cardiologists in decisions about diagnosis and treatment of moderate stenosis by combining physiological and anatomical parameters of coronary arteries.

Figure 3. X-ray image before and after histogram equalization.

Figure 6. Mixed precision training iteration for a layer 83.

Figure 7. The loss of the proposed model during training. Model converged after 238 epochs.

Table 2. The dataset used for training and testing the proposed model.

Preprocessing
Pre-processing is an essential step in deep learning that involves transforming and preparing raw data for effective utilization by a neural network 66. It involves various techniques such as decoding, resizing, normalization, augmentation, and histogram equalization.
Decoding. Image decoding is converting the encoded image back to an uncompressed bitmap. The attribute channels indicate the decoded image's desired number of color channels.

Table 4. The proposed model's evaluation results in the DenseNet-169 network (columns: Sensitivity, Specificity, Precision, F1-Score, Support).

Figure 8. Confusion matrix of model evaluation on the test data set.

Table 5. Comparison of the prediction accuracy of the proposed model on the test set using different pretrained networks as feature extractors.

Table 6. Evaluation results for top 3 pre-trained feature extractors.